Body fat percentage

Group 012E01

Tudor Liu, Samuel Tsui, Shirley Wang, William Wang

Introduction

  • source: The Data And Story Library (DASL)

Rows: 250
Columns: 16
$ Density <dbl> 1.0708, 1.0853, 1.0414, 1.0751, 1.0340, 1.0502, 1.0549, 1.0704…
$ Pct.BF  <dbl> 12.3, 6.1, 25.3, 10.4, 28.7, 20.9, 19.2, 12.4, 4.1, 11.7, 7.1,…
$ Age     <int> 23, 22, 22, 26, 24, 24, 26, 25, 25, 23, 26, 27, 32, 30, 35, 35…
$ Weight  <dbl> 154.25, 173.25, 154.00, 184.75, 184.25, 210.25, 181.00, 176.00…
$ Height  <dbl> 67.75, 72.25, 66.25, 72.25, 71.25, 74.75, 69.75, 72.50, 74.00,…
$ Neck    <dbl> 36.2, 38.5, 34.0, 37.4, 34.4, 39.0, 36.4, 37.8, 38.1, 42.1, 38…
$ Chest   <dbl> 93.1, 93.6, 95.8, 101.8, 97.3, 104.5, 105.1, 99.6, 100.9, 99.6…
$ Abdomen <dbl> 85.2, 83.0, 87.9, 86.4, 100.0, 94.4, 90.7, 88.5, 82.5, 88.6, 8…
$ Waist   <dbl> 33.54331, 32.67717, 34.60630, 34.01575, 39.37008, 37.16535, 35…
$ Hip     <dbl> 94.5, 98.7, 99.2, 101.2, 101.9, 107.8, 100.3, 97.1, 99.9, 104.…
$ Thigh   <dbl> 59.0, 58.7, 59.6, 60.1, 63.2, 66.0, 58.4, 60.0, 62.9, 63.1, 59…
$ Knee    <dbl> 37.3, 37.3, 38.9, 37.3, 42.2, 42.0, 38.3, 39.4, 38.3, 41.7, 39…
$ Ankle   <dbl> 21.9, 23.4, 24.0, 22.8, 24.0, 25.6, 22.9, 23.2, 23.8, 25.0, 25…
$ Bicep   <dbl> 32.0, 30.5, 28.8, 32.4, 32.2, 35.7, 31.9, 30.5, 35.9, 35.6, 32…
$ Forearm <dbl> 27.4, 28.9, 25.2, 29.4, 27.7, 30.6, 27.8, 29.0, 31.1, 30.0, 29…
$ Wrist   <dbl> 17.1, 18.2, 16.6, 18.2, 17.7, 18.8, 17.7, 18.8, 18.2, 19.2, 18…

Data description


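Siri’s equation, where \(D\) is body density (the Density column) and \(PBF\) is percentage body fat (the Pct.BF column):


\(PBF = \frac{495}{D} - 450\)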
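As a quick sanity check, Siri's equation can be applied to a couple of the Density values shown in the data preview. This is a minimal sketch; the function name `siri_percent_body_fat` is made up for illustration and is not part of the group's code.

```python
def siri_percent_body_fat(density: float) -> float:
    """Siri's equation: percentage body fat from body density."""
    if density <= 0:
        raise ValueError("density must be positive")
    return 495.0 / density - 450.0


# Density values taken from the first and third rows of the data preview.
for density in (1.0708, 1.0414):
    pbf = siri_percent_body_fat(density)
    print(f"D = {density:.4f} -> PBF = {pbf:.1f}")
```

The two results round to 12.3 and 25.3, which agree with the Pct.BF values listed for the same rows.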

Data cleaning

                 [,1]     [,2]    [,3]     [,4]      [,5]     [,6]
Waist        33.54331 32.67717 34.6063 34.01575  39.37008 37.16535
Abdomen      85.20000 83.00000 87.9000 86.40000 100.00000 94.40000
Abdomen/2.54 33.54331 32.67717 34.6063 34.01575  39.37008 37.16535
  Pct.BF Age Weight  Height Neck Chest Abdomen   Hip Thigh Knee Ankle Bicep
1   12.3  23 69.967 172.085 36.2  93.1    85.2  94.5  59.0 37.3  21.9  32.0
2    6.1  22 78.585 183.515 38.5  93.6    83.0  98.7  58.7 37.3  23.4  30.5
3   25.3  22 69.853 168.275 34.0  95.8    87.9  99.2  59.6 38.9  24.0  28.8
4   10.4  26 83.801 183.515 37.4 101.8    86.4 101.2  60.1 37.3  22.8  32.4
5   28.7  24 83.574 180.975 34.4  97.3   100.0 101.9  63.2 42.2  24.0  32.2
6   20.9  24 95.368 189.865 39.0 104.5    94.4 107.8  66.0 42.0  25.6  35.7
  Forearm Wrist
1    27.4  17.1
2    28.9  18.2
3    25.2  16.6
4    29.4  18.2
5    27.7  17.7
6    30.6  18.8
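
The output above suggests three cleaning steps: Waist is dropped because it equals Abdomen divided by 2.54 (the same measurement in inches), Density is dropped, and Weight and Height are converted to metric units. The sketch below reproduces those steps for the first row. The conversion factors and the helper name `clean_row` are assumptions inferred from the printed values, not taken from the group's code.

```python
LB_TO_KG = 0.45359237  # assumed factor: pounds to kilograms
IN_TO_CM = 2.54        # assumed factor: inches to centimetres


def clean_row(row: dict) -> dict:
    """Drop the redundant columns and convert Weight and Height to metric."""
    cleaned = dict(row)
    cleaned.pop("Density", None)  # not used as a predictor in the cleaned table
    cleaned.pop("Waist", None)    # duplicate of Abdomen, expressed in inches
    cleaned["Weight"] = round(row["Weight"] * LB_TO_KG, 3)
    cleaned["Height"] = round(row["Height"] * IN_TO_CM, 3)
    return cleaned


# First row of the data preview (only the columns needed here).
first = {"Density": 1.0708, "Pct.BF": 12.3, "Weight": 154.25,
         "Height": 67.75, "Abdomen": 85.2, "Waist": 33.54331}

print(clean_row(first))
# Abdomen in inches, which matches the Waist column.
print(round(first["Abdomen"] / IN_TO_CM, 5))
```

Under these assumed factors the first row gives a Weight of 69.967 and a Height of 172.085, the same values shown in the cleaned table.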

Stepwise variable selection

Forward selection:

            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -32.574      8.363  -3.895    0.000
Abdomen        0.884      0.070  12.658    0.000
Weight        -0.236      0.076  -3.086    0.002
Wrist         -1.764      0.486  -3.627    0.000
Bicep          0.243      0.156   1.558    0.121
Age            0.062      0.031   2.012    0.045
Thigh          0.178      0.121   1.474    0.142

Backward selection:

            Estimate Std. Error t value Pr(>|t|)
(Intercept)    5.040      8.359   0.603    0.547
Age            0.073      0.030   2.396    0.017
Height        -0.106      0.050  -2.125    0.035
Neck          -0.451      0.218  -2.073    0.039
Abdomen        0.823      0.069  11.958    0.000
Hip           -0.195      0.130  -1.501    0.135
Thigh          0.224      0.129   1.735    0.084
Forearm        0.296      0.192   1.542    0.124
Wrist         -1.731      0.494  -3.506    0.001

Comparing AIC

                   Forward model          Backward model
Predictors         Estimates   p          Estimates   p
(Intercept)        -32.57      <0.001     5.04        0.547
Abdomen            0.88        <0.001     0.82        <0.001
Weight             -0.24       0.002
Wrist              -1.76       <0.001     -1.73       0.001
Bicep              0.24        0.121
Age                0.06        0.045      0.07        0.017
Thigh              0.18        0.142      0.22        0.084
Height                                    -0.11       0.035
Neck                                      -0.45       0.039
Hip                                       -0.19       0.135
Forearm                                   0.30        0.124
Observations       250                    250
R2 / R2 adjusted   0.742 / 0.736          0.747 / 0.739
AIC                1443.187               1442.736
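
For reference, the sketch below shows one common way to compute AIC for a linear model with Gaussian errors, followed by a comparison of the two AIC values reported above. The function `gaussian_aic` is illustrative only: it counts the error variance as an extra parameter, which is, as far as we know, the convention R uses for `lm` fits. The residual sums of squares of the two models are not reported here, so only the reported AIC values are compared.

```python
import math


def gaussian_aic(rss: float, n: int, n_coefficients: int) -> float:
    """AIC = 2k - 2 log L for a Gaussian linear model.

    rss is the residual sum of squares, n the number of observations, and
    n_coefficients the number of regression coefficients including the
    intercept. The error variance is counted as one more parameter.
    """
    if rss <= 0 or n <= 0:
        raise ValueError("rss and n must be positive")
    log_likelihood = -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)
    k = n_coefficients + 1
    return 2 * k - 2 * log_likelihood


# AIC values as reported in the comparison table.
reported = {"forward": 1443.187, "backward": 1442.736}
best = min(reported, key=reported.get)
difference = round(reported["forward"] - reported["backward"], 3)
print(f"lowest AIC: {best}, difference: {difference}")
```

On the reported values the two models differ by 0.451 AIC units, with the backward model slightly lower.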

Cross validation

Boxplot of rmse and mae

Means of rmse and mae

      forward backward
rmse 4.317421 4.316203
mae  3.590135 3.563110
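
The two error measures can be illustrated with a small k-fold cross-validation loop. This is a simplified sketch, not the group's procedure: it fits a one-predictor least-squares line (Pct.BF on Abdomen) to the six rows shown in the cleaned table, and the helper names and the interleaved fold assignment are assumptions made for illustration.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def fit_simple_ols(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def k_fold_cv(x, y, k):
    """Average out-of-fold RMSE and MAE over k interleaved folds."""
    n = len(x)
    fold_rmse, fold_mae = [], []
    for fold in range(k):
        test_idx = list(range(fold, n, k))  # every k-th row is held out
        train_idx = [i for i in range(n) if i not in test_idx]
        intercept, slope = fit_simple_ols([x[i] for i in train_idx],
                                          [y[i] for i in train_idx])
        actual = [y[i] for i in test_idx]
        predicted = [intercept + slope * x[i] for i in test_idx]
        fold_rmse.append(rmse(actual, predicted))
        fold_mae.append(mae(actual, predicted))
    return sum(fold_rmse) / k, sum(fold_mae) / k


# First six rows of the cleaned data (Abdomen and Pct.BF columns).
abdomen = [85.2, 83.0, 87.9, 86.4, 100.0, 94.4]
pct_bf = [12.3, 6.1, 25.3, 10.4, 28.7, 20.9]
cv_rmse, cv_mae = k_fold_cv(abdomen, pct_bf, k=3)
print(f"rmse = {cv_rmse:.3f}, mae = {cv_mae:.3f}")
```

With only six rows and one predictor the numbers are not comparable to the means reported above; the point is only how each fold's held-out predictions feed into rmse and mae.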

Using Caret

Choose forward model

                   Forward model          Backward model
Predictors         Estimates   p          Estimates   p
(Intercept)        -32.57      <0.001     5.04        0.547
Abdomen            0.88        <0.001     0.82        <0.001
Weight             -0.24       0.002
Wrist              -1.76       <0.001     -1.73       0.001
Bicep              0.24        0.121
Age                0.06        0.045      0.07        0.017
Thigh              0.18        0.142      0.22        0.084
Height                                    -0.11       0.035
Neck                                      -0.45       0.039
Hip                                       -0.19       0.135
Forearm                                   0.30        0.124
Observations       250                    250
R2 / R2 adjusted   0.742 / 0.736          0.747 / 0.739
AIC                1443.187               1442.736

Assumption Checking

Before using the model to predict body fat percentage, it is important to check that it satisfies the linear regression assumptions.

Independence: this assumption is usually dealt with in the experimental design phase, before data collection, so we do not assess independence here.

Normality: the Q-Q plot of the residuals indicates that they are approximately normally distributed.
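
For readers unfamiliar with Q-Q plots, the sketch below shows how the plotted points can be built: the sorted, standardised residuals are paired with the matching quantiles of a standard normal distribution. The residual values are made up for illustration, and the plotting positions (i + 0.5) / n are one common choice among several.

```python
from statistics import NormalDist, mean, stdev


def qq_points(residuals):
    """Pair theoretical normal quantiles with sorted standardised residuals."""
    n = len(residuals)
    if n < 2:
        raise ValueError("need at least two residuals")
    centre, spread = mean(residuals), stdev(residuals)
    if spread == 0:
        raise ValueError("residuals must not all be equal")
    standardised = sorted((r - centre) / spread for r in residuals)
    standard_normal = NormalDist()
    theoretical = [standard_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, standardised))


# Hypothetical residuals, not taken from the fitted model.
toy_residuals = [-3.1, -1.4, -0.6, 0.2, 0.9, 1.5, 2.5]
for theoretical_q, sample_q in qq_points(toy_residuals):
    print(f"{theoretical_q:6.2f} {sample_q:6.2f}")
```

If the residuals are roughly normal, the two columns track each other closely, which appears as points lying near a straight line in the plot.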

Linearity: the plot of residuals against fitted values shows no obvious pattern, so a linear relationship between the predictors and body fat percentage appears reasonable.

Homoscedasticity: the error variance is assumed to be constant across all fitted values.

Evidence to back up

interactive correlation matrix

Why it matters

As health becomes an increasingly prominent concern, the group is interested in estimating body fat percentage indirectly, by predicting it from related body measurements.

What we discovered

Using AIC-based stepwise selection, the group found that body fat percentage can be described by a linear model with the following predictors:

  • Abdomen
  • Weight
  • Wrist
  • Bicep
  • Age
  • Thigh
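
To make the model concrete, the sketch below plugs one person's measurements into the forward model using the rounded coefficients from the stepwise output. It assumes the measurements are in the units of the cleaned data; the function name `predict_pct_bf` is made up for illustration, and the rounding of the coefficients means the result is approximate.

```python
# Rounded coefficients of the forward model, copied from the stepwise output.
FORWARD_COEFFICIENTS = {
    "(Intercept)": -32.574,
    "Abdomen": 0.884,
    "Weight": -0.236,
    "Wrist": -1.764,
    "Bicep": 0.243,
    "Age": 0.062,
    "Thigh": 0.178,
}


def predict_pct_bf(measurements: dict) -> float:
    """Linear prediction: intercept plus coefficient times measurement."""
    prediction = FORWARD_COEFFICIENTS["(Intercept)"]
    for name, coefficient in FORWARD_COEFFICIENTS.items():
        if name != "(Intercept)":
            prediction += coefficient * measurements[name]
    return prediction


# First row of the cleaned data; its observed Pct.BF is 12.3.
first_row = {"Abdomen": 85.2, "Weight": 69.967, "Wrist": 17.1,
             "Bicep": 32.0, "Age": 23, "Thigh": 59.0}
print(round(predict_pct_bf(first_row), 1))
```

For this row the prediction is about 15.8, compared with an observed value of 12.3, a gap of similar size to the cross-validated mean absolute error.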

Takeaway

  • Estimating your own body fat percentage can be simple
  • Tools needed:
  • A weighing scale and another kind of scale: a measuring tape!

Limitations

  • sample size
  • Equations
  • AIC method